About Unsolved Questions

Benchmarks shape progress in AI research. A useful benchmark should be both difficult and realistic: questions should challenge frontier models while also reflecting real-world usage. Yet current paradigms face a difficulty–realism tension: exam-style benchmarks are often made artificially difficult with limited real-world value, while benchmarks based on real user interaction often skew toward easy, high-frequency problems.

This work explores a radically different paradigm: assessing models on unsolved questions. Rather than a static benchmark scored once, we curate unsolved questions and evaluate models asynchronously over time with validator-assisted screening and community verification. We introduce UQ, a testbed of 500 challenging, diverse questions sourced from Stack Exchange, spanning topics from CS theory and math to less explored areas like sci-fi and history, probing capabilities including reasoning, factuality, and browsing. UQ is difficult and realistic by construction: unsolved questions are often hard and naturally arise when humans seek answers, so solving them yields direct real-world value.

All: 500 questions
Technology: 52 questions
Culture & Recreation: 16 questions
Life & Arts: 35 questions
Science: 395 questions
Model Performance Leaderboard
Models are ranked by number of questions that pass human verification.

Rank | System            | Organization | UQ-Validator Pass Rate | All Questions | Technology | Culture & Recreation | Life & Arts | Science
-----|-------------------|--------------|------------------------|---------------|------------|----------------------|-------------|--------
#1   | o3 Pro            | OpenAI       | 75 / 500 (15.0%)       | 4 / 500*      | 0 / 52     | 0 / 16               | 0 / 35      | 4 / 395
#2   | Gemini 2.5 Pro    | Google       | 25 / 500 (5.0%)        | 3 / 500*      | 0 / 52     | 0 / 16               | 0 / 35      | 3 / 395
#3   | o4 mini           | OpenAI       | 25 / 500 (5.0%)        | 2 / 500*      | 0 / 52     | 0 / 16               | 0 / 35      | 2 / 395
#4   | o3                | OpenAI       | 44 / 500 (8.8%)        | 1 / 500*      | 1 / 52     | 0 / 16               | 0 / 35      | 0 / 395
#5   | DeepSeek R1       | DeepSeek     | 11 / 500 (2.2%)        | 1 / 500*      | 0 / 52     | 0 / 16               | 0 / 35      | 1 / 395
#6   | GPT-5             | OpenAI       | 88 / 500 (17.6%)       | 0 / 500*      | 0 / 52     | 0 / 16               | 0 / 35      | 0 / 395
#7   | Claude Opus 4     | Anthropic    | 7 / 500 (1.4%)         | 0 / 500*      | 0 / 52     | 0 / 16               | 0 / 35      | 0 / 395
#8   | Claude 3.7 Sonnet | Anthropic    | 6 / 500 (1.2%)         | 0 / 500*      | 0 / 52     | 0 / 16               | 0 / 35      | 0 / 395

Total Questions: 500
Models Evaluated: 8
Questions Solved by Models: 10

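The ranking rule stated for the leaderboard can be illustrated with a short Python sketch. The counts are copied from the leaderboard above; the `Entry` type and the `rank` and `pass_rate` helpers are hypothetical names introduced here. The tie-break on the UQ-Validator count is an assumption: it happens to reproduce the displayed order, but the page only states that models are ranked by human-verified passes.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    system: str
    validator_passed: int  # answers that passed the UQ-Validator, out of 500
    human_verified: int    # answers that passed human verification, out of 500


# Counts copied from the leaderboard, deliberately listed out of order.
ENTRIES = [
    Entry("GPT-5", 88, 0),
    Entry("o3", 44, 1),
    Entry("Claude 3.7 Sonnet", 6, 0),
    Entry("o3 Pro", 75, 4),
    Entry("DeepSeek R1", 11, 1),
    Entry("Gemini 2.5 Pro", 25, 3),
    Entry("Claude Opus 4", 7, 0),
    Entry("o4 mini", 25, 2),
]


def rank(entries: list[Entry]) -> list[Entry]:
    """Sort by human-verified passes, highest first.

    ASSUMPTION: ties are broken by the validator count (highest first).
    """
    return sorted(entries, key=lambda e: (-e.human_verified, -e.validator_passed))


def pass_rate(passed: int, total: int = 500) -> str:
    """Format a count the way the leaderboard shows it, e.g. '75 / 500 (15.0%)'."""
    return f"{passed} / {total} ({100 * passed / total:.1f}%)"


if __name__ == "__main__":
    for position, entry in enumerate(rank(ENTRIES), start=1):
        print(f"#{position} {entry.system}: {pass_rate(entry.validator_passed)}, "
              f"{entry.human_verified} human-verified")
```

Note that a high validator pass count does not imply a high rank: GPT-5 has the most validator passes in the table but no human-verified answers.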
Stack Exchange URL Mirroring
Access UQ questions directly using Stack Exchange URLs

Found an interesting unsolved question on Stack Exchange? You can check whether it is in the UQ dataset by modifying the URL:

Original Stack Exchange URL:

https://math.stackexchange.com/questions/358423
becomes

UQ Mirrored URL:

https://uq.stanford.edu/q/math.stackexchange.com/questions/358423

✅ If the question exists in UQ:

You'll be automatically redirected to the UQ question page with model answers and analysis.

📝 If the question is not found:

You can submit a request to have it considered for inclusion in our dataset.

Tip: Both short URLs (without title) and full URLs (with title) work the same way!
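The URL rewrite described above can be sketched in Python. This is a minimal illustration, not an official client: the function name `to_uq_mirror_url` is hypothetical, the prefix is taken from the example above, and dropping any query string or fragment is an assumption.

```python
from urllib.parse import urlsplit

# Prefix taken from the mirrored URL example on this page.
UQ_MIRROR_PREFIX = "https://uq.stanford.edu/q/"


def to_uq_mirror_url(stack_exchange_url: str) -> str:
    """Build the UQ mirrored URL for a Stack Exchange question URL.

    The host and path are appended to the UQ prefix unchanged, so both the
    short form (no title) and the full form (with title) are handled alike.
    ASSUMPTION: query strings and fragments are not needed and are dropped.
    """
    parts = urlsplit(stack_exchange_url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an absolute http(s) URL: {stack_exchange_url!r}")
    return f"{UQ_MIRROR_PREFIX}{parts.netloc}{parts.path}"


if __name__ == "__main__":
    print(to_uq_mirror_url("https://math.stackexchange.com/questions/358423"))
```

Whether the resulting page redirects to a UQ question or offers the submission form depends on the site, so the sketch only builds the URL and does not fetch it.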

Top Questions

The most popular questions from the UQ Project based on Stack Exchange votes

873
SE votes

A proof of without prime ideals?

Unsolved

Background. If is a commutative ring, it is easy to prove , where denotes the Krull dimension. If is Noetherian, we have equality. Every proof of this fac...

Science
Mathematics
ring-theory
commutative-algebra
noetherian
+2 more
Posted on: 4/11/2013
UQ ID: 256

632
SE votes

Is there a bijection of with itself such that the forward map is connected but the inverse is not?

Unsolved

Let be two topological spaces. We say that a map between their power sets is connected if for every connected, ...

Science
Mathematics
general-topology
metric-spaces
examples-counterexamples
+1 more
Posted on: 9/30/2014
UQ ID: 257

185
SE votes

Given a finite extension of the rationals, , we know that by the primitive element theorem, so every has the form ...

Science
Mathematics
abstract-algebra
number-theory
ring-theory
+1 more
Posted on: 4/22/2016
UQ ID: 258

142
SE votes

Let be a field. Say that polynomials are almost surjective over if for any nonconstant polynomial , the image of the map contains all but finitely many points of ....

Science
Mathematics
abstract-algebra
polynomials
field-theory
Posted on: 5/19/2016
UQ ID: 259

135
SE votes

Probability for an matrix to have only real eigenvalues

Unsolved

Let be an random matrix where every entry is i.i.d. and uniformly distributed on . What is the probability that has only real eigenvalues? The answer cannot be or , s...

Science
Mathematics
linear-algebra
probability
matrices
+2 more
Posted on: 7/27/2020
UQ ID: 260

122
SE votes

I was curious about the sum of two consecutive primes and after proving that the sum for the odd primes always has at least 3 prime divisors, I came up with this question:

Find the least natural numb...

Science
Mathematics
number-theory
prime-numbers
prime-gaps
Posted on: 10/15/2013
UQ ID: 261

112
SE votes

Say that the perimeter of a polyhedron is the sum of its edge lengths. What is the maximum volume of a polyhedron with a unit perimeter? A reasonable first guess would be the regular tetrahedron of si...

Science
Mathematics
geometry
optimization
volume
+2 more
Posted on: 3/1/2021
UQ ID: 262

94
SE votes

Can Erdős-Turán theorem be generalised that way?

Unsolved

Suppose for an arbitrary group word over the alphabet of symbols is a variety of all groups , that satisfy an identity ....

Science
Mathematics
combinatorics
group-theory
finite-groups
+2 more
Posted on: 1/11/2019
UQ ID: 263

92
SE votes

Does there exist a complete, finitely axiomatizable, first-order theory with exactly 3 countable non-isomorphic models? A few relevant comments: There is a classical example of a complete theory w...

Science
Mathematics
logic
model-theory
Posted on: 8/29/2014
UQ ID: 264

85
SE votes

Regular way to fill a square with rectangles?

Unsolved

The series suggests it might be possible to tile a square with nonrepeated rectangles of the form . Is there a know...

Science
Mathematics
sequences-and-series
visualization
egyptian-fractions
Posted on: 2/24/2015
UQ ID: 265

News

  • [08/2025] Released Unsolved Questions (UQ) Paper

Contact

For questions about the project:

{niefan, kzliu, niklasm}@stanford.edu

For technical issues:

niefan@stanford.edu

Cite

If you use UQ: Assessing Language Models on Unsolved Questions, please cite:

@misc{nie2025uqassessinglanguagemodels,
  title={UQ: Assessing Language Models on Unsolved Questions}, 
  author={Fan Nie and Ken Ziyu Liu and Zihao Wang and Rui Sun and Wei Liu and Weijia Shi and Huaxiu Yao and Linjun Zhang and Andrew Y. Ng and James Zou and Sanmi Koyejo and Yejin Choi and Percy Liang and Niklas Muennighoff},
  year={2025},
  eprint={2508.17580},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2508.17580}
}